Hyperspectral imaging sensor

ABSTRACT

Systems, methods, and computer readable media for hyperspectral imaging are provided. An example hyperspectral imaging sensor system to identify item composition includes an imager to capture hyperspectral imaging data of one or more items with respect to a target. The example includes a sensor to be positioned with respect to the target to trigger capture of the image data by the imager based on a characteristic of the target. The example includes a processor to prepare the captured imaging data for analysis to at least: identify the one or more items; determine composition of the one or more items; calculate an energy intake associated with the one or more items; and classify the target based on the energy intake.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority benefit as a National Stage Application of International Application No. PCT/US18/21832, filed Mar. 9, 2018, which claims the priority benefit of U.S. Patent Application No. 62/548,194, filed Aug. 21, 2017, and U.S. Patent Application No. 62/469,352, filed Mar. 9, 2017, the contents of which are incorporated herein by reference in their entireties.

BACKGROUND

In the past few years, occurrence rates of metabolic disorders have skyrocketed. For example, cardiovascular heart disease is a leading cause of death around the world. While many aspects of these diseases are genetic, a significant portion of the risk can be mitigated through proper diet and exercise. However, maintaining a proper diet, for many people, is often difficult to sustain. Difficulty sustaining weight loss poses a challenge to both continuing a weight-loss trajectory and maintaining the weight lost. It is very difficult to determine the exact nutritional composition of a random food item unless all the ingredients are prepared from start to finish with meticulous attention to nutritional facts and portion sizes consumed. Additionally, gathering ingredient nutritional facts and portion sizes assumes that individuals attempting to keep a healthy diet will be preparing all of their own meals. Even more complications are introduced when the individual chooses to eat outside of the home or otherwise consumes meals prepared by others. For at least these reasons, a technology is needed that provides accurate nutritional information of food consumed and integrates seamlessly into the user's life.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates four spectral signatures separating fat-free milk from full-fat milk.

FIG. 2 illustrates an example hyperspectral imaging sensor system.

FIG. 3 illustrates an example mobile health system generating a caloric need.

FIG. 4 illustrates further detail regarding the example hyperspectral imaging sensor system of FIG. 2.

FIG. 5 depicts an example implementation of a video camera trigger.

FIG. 6 depicts an example implementation of an image calibrator.

FIG. 7 depicts an example implementation of a food quantifier.

FIG. 8 depicts an example implementation of a food identifier.

FIG. 9 depicts an example implementation of an energy intake calculator.

FIG. 10 depicts an example implementation of an energy expenditure calculator.

FIG. 11 shows a flow diagram of an example method of hyperspectral image analysis using the example system of FIGS. 4-10.

FIG. 12 is a block diagram of an example processor platform 1200 capable of executing instructions to implement the example apparatus, systems, and methods disclosed and described herein.

FIG. 13 shows an example snapshot imager apparatus.

FIG. 14 depicts an example mobile infrastructure to communicate and process data from a wearable device to a cloud-based infrastructure.

FIG. 15A depicts example food samples that can be imaged and analyzed for caloric content and composition.

FIG. 15B depicts example food items differentiated based on their associated spectrum characteristics.

FIG. 15C depicts spectra associated with images of medium-fat stew in three different light conditions.

FIG. 15D depicts spectra associated with images of high-fat stew in three different light conditions.

FIG. 16 illustrates a table showing results of food calorie content identification using RGB and hyper-spectral image analysis.

FIG. 17 shows an example confusion matrix including labels classified using one or more SVMs.

FIG. 18A shows an example confusion matrix for hyperspectral image data and analysis.

FIG. 18B shows an example confusion matrix for RGB image data and analysis.

FIG. 19 shows an example of F-measures obtained for classification of fat in eggs using hyperspectral analysis alone, RGB analysis alone, and a fused hyperspectral/RGB analysis.

FIG. 20 illustrates example systems and methods described herein applied to a chewing and eating detection framework.

DETAILED DESCRIPTION OF CERTAIN EMBODIMENTS

The novel features of a device of this disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of this disclosure will be obtained by reference to the following detailed description that sets forth illustrative examples, in which the principles of a device of this disclosure are utilized, and the accompanying drawings.

Overview

In the following detailed description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific examples that may be practiced. These examples are described in sufficient detail to enable one skilled in the art to practice the subject matter, and it is to be understood that other examples may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the scope of the subject matter of this disclosure. The following detailed description is, therefore, provided to describe an exemplary implementation and not to be taken as limiting on the scope of the subject matter described in this disclosure. Certain features from different aspects of the following description may be combined to form yet new aspects of the subject matter discussed below.

When introducing elements of various embodiments of the present disclosure, the articles “a,” “an,” “the,” and “said” are intended to mean that there are one or more of the elements. The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements.

In recent years, there has been an increase in obesity rates, along with obesity's adverse health effects, stemming primarily from energy imbalance. As a result, rates of metabolic disorders such as cardiovascular heart disease have skyrocketed. Certain examples provide automated, intelligent wearable systems and associated methods of use that can aid individuals in more reliably monitoring their diet to prevent and fight off these disorders. Certain examples provide advanced cameras combined with optimized image processing algorithms and expanding food-image databases to form an automated diet monitoring system to estimate caloric and nutrient intake through automated food identification.

Certain examples provide image data analysis to evaluate food items. Red-Green-Blue (RGB)-color images can detect food items with reasonable accuracy, but RGB image analysis fails to distinguish between a large variability of food (e.g., ingredient) composition within a single food item. Instead, in certain examples, using hyperspectral imaging provides greater value (e.g., accuracy and precision, etc.) in differentiating within-food item variability. That is, hyperspectral imaging provides enhanced precision in evaluating nutritional composition compared to typical RGB-color images. A hyperspectral camera captures images of foods prepared and cooked with varying levels of fat, protein, carbohydrate, salt, and sugar content. Given the large dimensionality of each hyperspectral image, a Principal Component Analysis (PCA) is applied to reduce the dimensionality of the data, and a Support Vector Machine (SVM) classifier with a Radial Basis Function (RBF) kernel is used to classify each food item into within-food sub-categories. Certain examples greatly enhance accuracy of distinguishing between ingredients or other food components compared to methods that only use RGB-color images. Certain examples use hyperspectral image analysis to better classify samples based on within-food caloric content.

Certain examples provide mobile hyperspectral imaging sensors to analyze object composition in conjunction with a passive sensor. For example, mobile health (mHealth) systems are provided to detect food items using a low-battery hyperspectral imaging sensor and a passive sensing necklace, bracelet, etc., used to detect proximity, gesture, etc., to trigger the hyperspectral imaging sensor. Such portable hyperspectral imaging sensors can be used to target undernutrition in low-income countries and obesity and food allergies in the U.S. and other developed countries, for example. While examples are disclosed and described herein in the context of food items and associated composition and caloric information, a determination of composition can be facilitated for other items additionally or in the alternative.

Despite several efforts in the field of image recognition, conventional imaging technologies that only acquire morphology are not adequate for the accurate detection and assessment of characteristics or intrinsic properties of a product. Spectroscopic imaging covering the visible and near-IR spectrum (e.g., from 400 nm to 1100 nm) can help identify objects, such as food items, etc. Certain examples provide hyperspectral imaging fused with food database information to determine composition and caloric intake.

Hyperspectral imaging collects and processes information from across the electromagnetic spectrum. Hyperspectral imaging determines a spectrum associated with each pixel in an image being analyzed. The spectrum information can be processed for each pixel in the image to identify objects, component materials, and/or associated processes in the image, for example. While human eyes view images in terms of red, green, and blue (RGB), hyperspectral imaging divides light into many more frequency bands, in which spectral bands have fine wavelength resolution and cover a wide range of wavelengths. In hyperspectral imaging, contiguous spectral bands are measured. Different objects have different spectral “fingerprints” in the electromagnetic spectrum. Hyperspectral signals evaluate objects using a vast portion of the electromagnetic spectrum and identify those objects.

In hyperspectral imaging, sensors collect information as a set of images. Each hyperspectral image represents a certain narrow wavelength range of the electromagnetic spectrum (referred to as a spectral band). The set of images together forms a three-dimensional hyperspectral cube (x, y, λ), in which x and y represent two spatial dimensions of the captured image environment (e.g., the target eating, sleeping, moving, resting, etc.) and λ represents a spectral dimension defined by a range of wavelengths, for example. A detector array or camera can be used to capture a “snapshot” image of wavelength values for hyperspectral imaging analysis, for example.

Wearable and mobile smartphone technologies can be leveraged in systems and methods to assist individuals with monitoring, evaluating, and controlling their dietary intake. However, there are currently many limitations for mobile technology used for dietary assessment. These issues include that data acquired from phones usually include a large level of noise and variance, making acquired audio and/or image data difficult to accurately and precisely analyze. Furthermore, user involvement with these technologies can be burdensome and time consuming. Users often must take multiple pictures of their food and upload the images, which is a tedious and involved process. Another limitation of these technologies is that they are only capable of obtaining images in the visible light range (e.g., RGB). This makes it very difficult to distinguish caloric composition since two foods may visually appear similar but have drastically different nutritional values.

To remedy this problem, certain examples utilize hyperspectral imaging to provide a technological improvement in imaging analysis (and, in some examples, in place of or to augment audio analysis). Hyperspectral cameras obtain information from the near-infrared spectrum (e.g., 400 nm-1100 nm) and also from parts of the ultraviolet spectrum. As a result, hyperspectral cameras have the potential to differentiate between foods that appear similar visually but differ nutritionally, and show promise for use in dietary management. Through feature extraction and classification of hyperspectral images of food, hyperspectral imaging can be used to distinguish food types and to classify foods of different nutritional value. Further, RGB image databases can be augmented with hyperspectral databases to improve accuracy in data analysis.

Short battery lifetime is a fundamental hurdle that limits the full potential of wearables to help gain a better understanding of people's habits. Certain examples utilize a proximity sensor as a passive sensing trigger to continuously monitor feeding gestures and trigger an associated system on and off. The system, once triggered, can capture hyperspectral image information for analysis, for example.

Additionally, sedentary individuals have different dietary and health requirements than individuals who are active and working in manual labor all day. As a result, physical activity can be monitored, such as by using a tri-axial accelerometer.

Certain examples provide a hyperspectral imaging sensor that combines three main sensing units and associated processing capability: 1) a proximity sensor to capture feeding gestures and trigger a spectral imaging sensor; 2) a hyperspectral camera (e.g., with diffractive lens, sensor, and image signal processor (ISP)); and 3) a tri-axial accelerometer to determine physical activity and motion (e.g., to account for daily caloric intake/recommendation). The sensor device also includes a signal/control microprocessor, data storage, and a communication interface (e.g., WiFi, Bluetooth, Bluetooth Low Energy (BLE), near field communication (NFC), etc.), for example.

To maintain battery life, the communication interface can be triggered when the neck-worn device is charging to transmit the data to a smartphone and/or other mobile computing device (e.g., smartphone, tablet computer, laptop, etc.) and then upload the data to a backend system (e.g., a laptop computer, a desktop computer, a server, etc.) for further processing and analysis. In some examples, depending on the operating environment, the communication interface can be modified to transmit data continuously to the mobile computing device.

Certain examples can be applied to understand and react to undernutrition, obesity, food allergies, etc. Certain examples more accurately and passively detect foods consumed and associated caloric intake without burdening the user through self-reporting.

For example, approximately one in every three children under the age of five is stunted, with many experiencing malnutrition rates well above the World Health Organization's emergency threshold. While several non-profit agencies are working to strengthen the resilience of rural populations, many areas do not receive assistance either because they do not meet the threshold for emergency aid, or aid is unable to adequately reach the region. Certain examples provide technology to help improve understanding of the undernutrition problem and help to build a sustainable strategy that will help solve the problem.

As another example, the American Medical Association now considers obesity a disease that increases the risk of other chronic diseases. Obesity is prevalent: more than a third of American adults (34.9%) are obese, and two-thirds are overweight or obese. Obesity is also a major driver of preventable healthcare costs, not only in the United States, but across the globe, including in developing countries. Frequent overeating (intake of excess kilocalories) fosters loss of energy balance, causing obesity. Because there currently exists no objective way to detect overeating in real time, or to predict overeating, there is no known behavioral intervention to prevent overeating by pre-empting it in the real-time context of risk. Certain examples can help evaluate and address obesity.

As another example, more than 50 million Americans have some form of food allergy, which affects 4 to 6% of children and 4% of adults (CDC). The following food items contribute to 90% of food allergies: eggs, milk, peanuts, tree nuts, fish, shellfish, wheat, and soy. Certain examples can detect these food items in an image and can help prevent allergic reactions.

As described above, existing technologies primarily use regular red-green-blue (RGB) images and geo-location (to limit the database of food options to select from) to identify food items (in hopes of detecting caloric content). However, RGB images are not sufficient to distinguish between different foods or to arrive at the composition of a food. In contrast, certain examples can analyze images at varying spectra to help uniquely identify the food item in the image from captured hyperspectral imaging data.

Certain examples provide new technological solutions to measure food and nutrient intake at an individual and/or group level (e.g., in low-income countries, etc.) and provide a set of diagnostic tools that can be used to assess nutrient status at the individual level, etc. Certain examples contribute to increasing the availability of valid, reliable, and timely data to inform policy and programming at the country level, global level, individual level, etc.

Currently, there is no reliable automated method to identify foods consumed and their caloric content. Certain examples determine, for every food consumed, a unique food signature that is invariant to light (e.g., eating in the dark or light), heat (e.g., whether the food is consumed hot or cold), and slight variations in visual appearance (e.g., the beans consumed may look different with extra sauce).

Certain examples combine passive sensing, such as a long-lasting (e.g., reduced need to recharge) passive image capture necklace, with food composition tables including newly collected data of local foods and their caloric consumption to calculate caloric need to determine undernutrition. For example, a neck-worn hyperspectral video camera-based system includes a fish-eye lens that detects caloric intake triggered by feeding gestures from proximity sensors. To determine caloric need, a tri-axial accelerometer is incorporated to account for energy expenditure. The tri-axial accelerometer provides a single-point monitor for vibration/movement in three planes (e.g., x, y, z) to measure movement of a user and provide feedback to translate that movement into a measure of energy expenditure.

Certain examples provide algorithms to accurately detect feeding gestures, identify the existence of food in an image, implement image segmentation to separate food items, track hand movement (to associate food consumption with food item), and fuse existing databases with collected hyperspectral data to map food images to known foods in a food database (using deep neural networks) to determine caloric content. Certain examples provide methods to fuse spectral signatures to address mixed foods.

Certain examples provide a near real-time (e.g., by end of day) calculation of caloric need on a smartphone and/or other mobile computing device to test a behavioral intervention (e.g., a micro-randomized behavioral intervention, etc.) that detects undernutrition and informs new programs and/or advances existing programs.

Certain examples provide a highly scalable technology and methodology which can be mass-produced and deployed in remote settings. Certain examples can passively sense and map images to caloric content and determine whether someone is undernourished or not.

Description of Certain Example Motion Sensing, Image Capture, and Hyperspectral Image Analysis

Despite several efforts in the field of image recognition, conventional imaging technologies that only acquire morphology are not adequate for the accurate detection and assessment of characteristics or intrinsic properties of a product. Spectroscopic imaging covering the visible and near-IR spectrum (e.g., from 400 nm to 1100 nm) can help identify unique spectral features that readily discriminate between food items. As an example, it is challenging to distinguish fat-free from full-fat milk using a traditional camera, but, as shown in the example of FIG. 1, a hyperspectral camera can provide a unique signature that helps distinguish the two types of milk (even in different lighting and heat conditions, after applying a normalization factor, for example). A graph 100 in FIG. 1 shows four (4) spectral signatures, in different lighting conditions, separating the fat-free milk from the full-fat milk. However, collecting data across the entire spectrum is not reasonable, and, given existing food databases that use traditional RGB cameras, certain examples provide fusion-based systems and methods that combine databases to determine caloric intake. Thus, certain examples can use hyperspectral image data alone or in conjunction with RGB image data to determine food composition and associated caloric intake.

FIG. 2 illustrates an example hyperspectral imaging sensor system 200 including a proximity sensor, accelerometer, spectral imaging sensor, lens and zone plate, image signal processor (ISP), microprocessor, battery, communication interface, and data storage. The example system 200 can be triggered to analyze an object 210 to generate a hyperspectral image analysis 220, for example.

Certain examples use the hyper-spectral camera of the system 200 to record the morphological and spectral features of food items to empower machine learning algorithms to accurately identify the composition and nutrition content of the scanned food items. For example, the hyperspectral camera can include a wide-spectral imager covering the visible and near-IR band from 400-1100 nm. Certain examples image in the near-infrared (IR) band as many of the meals are being consumed in very low-light conditions below 5 lux, where cameras operating in the visible light spectrum will produce very noisy data. The IR band, with an IR illuminator (automatically triggered when needed), can produce high-quality morphological and hyper-spectral data unobtrusively, for example. A convolutional neural network (CNN), recurrent neural network (RNN), and/or other machine learning construct can process the spectral frequency band information captured in the camera image(s) to correlate a spectral signature with an item (e.g., food, etc.) composition. Based on the composition, a calorie content can be determined, for example.

Certain examples provide hyper-spectral imaging including recording in video mode (e.g., collecting image data at a rate of three frames per second, etc.). For example, a hyper-spectral sensor can include expanded color filter arrays (CFA) rather than an RGB configuration in Bayer filters (e.g., with a 4×4 array of 16 bands of color filters manufactured by IMEC, etc.). This approach provides 10-15 nm spectral resolution but may be within a relatively narrow spectral band of only IR or visible light. The sensor is able to work with a wide range of optics in a very simple and robust package with reduced or minimal algorithm development, for example.

As another example, a hyper-spectral camera uses a combination of diffractive and achromatic lenses to record a full 400-1100 nm spectrum using mobile camera sensors. The camera uses a highly chromatic Fresnel zone plate lens as an objective lens to focus different wavelengths to different focus depths, followed by a light-field camera to resolve depth differences and recover spectral information, for example. In certain examples, a zone plate encodes spectral information in longitudinal positions, and the light-field camera decodes the position information into spectral data. This approach provides similar spectral resolution as the above spectral sensor example, but images across the full spectral range, for example.

In hyperspectral imaging, light and dark reference images can be obtained to calibrate the system to its environment, which may include varying light intensity. The dark reference image can be used to eliminate dark current effects. The dark reference image can be acquired by placing a black cap over the camera, for example. The light reference image can be obtained by acquiring a white image, such as a picture of a blank white sheet of paper. A corrected image can then be calculated as:

R=(Ro−Rd)/(Rr−Rd)  (Eq. 1),

where Ro is an acquired original hyperspectral image, Rr is the light reference image, and Rd is the dark reference image.
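
As a non-limiting illustration, the Eq. 1 correction can be sketched in a few lines of NumPy; the function and variable names here are hypothetical, and the three images are assumed to share the same shape:

import numpy as np

def calibrate(raw, light_ref, dark_ref, eps=1e-9):
    """Eq. 1: subtract dark current from the raw image and the white
    reference, then divide to correct for illumination."""
    raw = raw.astype(np.float64)
    return (raw - dark_ref) / (light_ref - dark_ref + eps)

The small eps term is an implementation detail added here to avoid division by zero in saturated or dead pixels; it is not part of Eq. 1.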

The hyperspectral imaging system 200 provides data in the form of a datacube. The datacube stores a multi-dimensional array describing a time series of hyperspectral imaging data. Each dimension of the datacube represents a different measure, and each cell in the cube represents data related to the respective measure. By condensing the spectral information from two dimensions to one dimension, each image is represented as a three-dimensional array. The first two dimensions of the array represent the x and y spatial dimensions of the image. The size of these dimensions varies based on the size of the image taken. The third dimension is the spectral dimension, λ, and represents the intensity at a specific waveband. In certain examples, the spectral dimension has a size of 240 because the imaging system captures 240 wavebands of information. For example, an intensity of the third waveband at a particular pixel is the value located in the third index of this dimension.
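
For illustration, the (x, y, λ) indexing described above can be sketched with a NumPy array; the spatial size below is an assumption, while the 240 wavebands follow the example in the text:

import numpy as np

# Hypothetical 100x100-pixel image with 240 wavebands: an (x, y, lambda) cube.
cube = np.zeros((100, 100, 240))

# Intensity of the third waveband at pixel (10, 20): index 2 of the
# spectral dimension under zero-based indexing.
value = cube[10, 20, 2]

# Full 240-point spectrum for a single pixel.
spectrum = cube[10, 20, :]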

Certain examples leverage acquired image information, accelerometer data, and proximity sensor information to detect foods and/or other items (e.g., separately and/or in combination) to recognize the item and its composition. For example, as shown in FIG. 3, the system 200 receives input images, accelerometer data, proximity sensor information, etc., and produces energy expenditure information, a feeding gesture trigger, a detected food source, a food item classification, hand/finger tracking, etc., to provide a caloric need/intake and/or other item composition information. For example, caloric expenditure can be generated from 3-axis inertial sensors, and an infrared proximity system can detect feeding gestures and trigger the hyperspectral camera after consecutive feeding gestures have been validated. Given consecutive images, food can be identified in a field of view and matched to a database, along with food item location, for example. Different food items can be segmented out in the field of view and classified at the source using information from an augmented food database, for example. Hand/finger tracking can be used to associate foods touched by hand with foods consumed (e.g., using information regarding a subject's average finger (e.g., thumb, index finger, etc.) and nail dimensions, etc.). Outputs can be combined to detect caloric need at a certain confidence interval, which can then be used to determine a condition such as whether the user is undernourished, overnourished, allergic, etc.

In certain examples, foods, activities, and/or other items can be detected and classified using deep neural networks (DNNs). For example, convolutional neural networks (CNNs), recursive neural networks (RNNs), and/or other deep learning networks can be used to detect and classify items and distinguish items over time. For example, Long Short-Term Memory (LSTM) RNNs can properly distinguish between class labels using spatiotemporal data (e.g., video, etc.).

In certain examples, a mobile computing device, such as a smartphone, tablet, etc., executes an application that acts as an information gateway to transmit data from the device to a backend server (e.g., triggered when charging the device). The application can also provide communication and feedback to the user via the mobile device, for example.

In certain examples, a mobile hyperspectral camera system 200 can determine dietary intake, monitor food quality and environmental contamination, provide feedback to a user, etc. Certain examples work with and generate a food database that maps food items to individual caloric information, etc., based on information from the camera system 200.

In certain examples, before features can be extracted from captured image data, the image data is preprocessed. For example, preprocessing can include cleaning, dimension reduction, and patch selection. For example, during the cleaning stage, the first 30 channels (e.g., 360 nm to 480 nm) of data are discarded and/or otherwise excluded as including a large amount of noise. To reduce the size of the data, the remaining 210 channels can be merged into seven larger bands by calculating the mean for every 30 bands. Then, twenty-four 30-pixel-by-30-pixel patches can be selected from each waveband to enhance the data set.
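
A minimal sketch of this cleaning and band-merging stage, assuming a NumPy datacube with 240 channels in its last dimension (names are illustrative):

import numpy as np

def clean_and_merge(cube):
    """Drop the first 30 noisy channels (approx. 360-480 nm), then merge
    the remaining 210 channels into 7 bands by averaging every 30."""
    kept = cube[:, :, 30:]                         # (x, y, 210)
    x, y, _ = kept.shape
    return kept.reshape(x, y, 7, 30).mean(axis=3)  # (x, y, 7)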

Features can be extracted from the preprocessed data set. For example, if fifteen different food dishes are present, with eleven dishes unique, some food items include multiple macromolecules (e.g., cookies include fat and sugar, etc.). When dishes appear visually different, features from the visible range of the spectra can be used to distinguish one dish from another. Features can include mean, standard deviation, and maximum and minimum values, for example. The spectrum can include seven bands (e.g., including the visible range of the spectrum in the first three of the seven bands), resulting in 3×4=12 total features in one example.

Macromolecule content detection can also be facilitated. First, obtained images are divided into training and testing datasets based on the lighting conditions of each image. Images taken in high and low intensity lighting conditions can be used as training data, for example, and images taken in medium intensity conditions can be used for testing the trained system (e.g., a trained neural network such as a CNN, RNN, etc., a random forest, etc.). In certain examples, food items can have nearly identical visual appearances but differ in their molecular content (e.g., more salt content, less salt content, etc.). Thus, information from more than the visible range of the spectra is used to distinguish macromolecule content.

Prior to extracting features, the image data can be preprocessed. For example, each image can be cropped so that any part of the image not including food is removed. Further, the first 30 bands of data may be noisy and, therefore, removed. To reduce the large size of the dataset, a principal component analysis can be applied. Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables.

The PCA can be used in hyperspectral imaging for dimension reduction and extracting features. The PCA functions by creating uncorrelated (orthogonal) components from correlated data. This allows for the use of only a few components to explain most of the variance within the data. Since only a few components need to be used, the PCA enables a much quicker analysis. For example, the first k principal components which incorporate 99% of the variance in the image data can be used.
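
A sketch of this reduction using scikit-learn, where each pixel's spectrum is treated as one observation and n_components is given as a variance fraction so that the first k components explaining 99% of the variance are kept automatically (names are illustrative):

import numpy as np
from sklearn.decomposition import PCA

def reduce_spectral_dims(cube, variance=0.99):
    x, y, bands = cube.shape
    pixels = cube.reshape(-1, bands)      # one row per pixel spectrum
    pca = PCA(n_components=variance)      # keep 99% of spectral variance
    reduced = pca.fit_transform(pixels)
    return reduced.reshape(x, y, -1), pca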

After cleaning and reducing the data, training and testing data sets can be created based on lighting conditions. From these images, patches (e.g., 15×15×k, 30×30×k, etc.) can be extracted. A size of the patch can be dependent on an initial size of the image, for example. A mean spectrum of pixels in each patch can form a feature vector.

Classification methods can be applied to the data sets to distinguish different foods and their macromolecule content. Metrics used to evaluate the classifications include accuracy, F-measure, and confusion matrix. The F-measure is a harmonic mean of recall and precision. The F-measure includes a parameter that determines a trade-off between recall and precision. A standard F-measure is F1, which provides equal importance to recall and precision, for example. The confusion matrix, also referred to as an error matrix, provides a visualization, via a table, of the performance of an algorithm such as image data processing.

In certain examples, a Radial Basis Function (RBF) kernel Support Vector Machine (SVM) can be used to classify foods based on their macromolecule content. The SVM provides a supervised learning model and associated learning algorithm to analyze data for classification and regression analysis. Given a set of classified training examples, the SVM builds a model to classify new examples according to the model of the already classified examples. The SVM classifies data by finding an optimal hyperplane to separate sets of data. However, the data cannot always be separated linearly, so an RBF kernel trick maps the data into a higher dimension to make classification easier. An RBF is a real-valued function whose value depends on a distance from an origin or center. For an RBF kernel, the kernel function is

K(x_(i), x_(j))=exp(−γ∥x_(i)−x_(j)∥₂²)  (Eq. 2),

wherein optimal values for the classifier are determined via n-fold cross validation. In Equation 2, an RBF kernel function for two center values x_(i) and x_(j) is defined by an exponential relationship between x_(i) and x_(j) as modified by gamma, γ, which quantifies a reach or span of influence for a single training example. Low gamma values indicate the influence is far, and high gamma values indicate the influence is close. The gamma value can be determined as an inverse of the radius of influence of samples selected by the model as support vectors. After the optimal values are determined, the SVM classifiers are trained with these parameters and then tested on the testing data.
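
One possible realization of this classifier using scikit-learn, with the grid of C and gamma values chosen here only for illustration, and n-fold cross validation (n=5 in this sketch) selecting the optimal parameters:

from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_rbf_svm(X_train, y_train):
    """X_train: patch feature vectors (e.g., mean spectra);
    y_train: macromolecule content labels."""
    pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    grid = {"svc__C": [0.1, 1, 10, 100],
            "svc__gamma": [0.001, 0.01, 0.1, 1]}
    search = GridSearchCV(pipe, grid, cv=5, scoring="f1_macro")
    search.fit(X_train, y_train)
    return search.best_estimator_

The trained estimator can then be scored on the held-out testing data (e.g., the medium-intensity lighting images) as described above.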

FIG. 4 illustrates further detail regarding the example hyperspectral imaging sensor system 200. As shown in the example of FIG. 4, the system 200 includes a video camera trigger 410, camera 420, image calibrator 430, food quantifier 440, food identifier 450, energy intake calculator 460, energy expenditure calculator 470, and a data processor 480 to process the generated content and calculations to determine a dietary and/or other status of the user, such as the user over-ate 492, under-ate 494, detected an allergen 496, etc. As described further below, the example system 200 of FIG. 4 captures and analyzes, based on an activation trigger, hyperspectral and/or other image data to determine energy intake versus energy expenditure for a target individual or group of individuals and, based on the energy evaluation, determine a status of the monitored target individual(s) (e.g., the person over-ate 492, under-ate 494, has been exposed to an allergen 496, etc.). To maintain privacy and prolong battery life, the camera 420 is triggered 410 on and off such that the camera 420 is active and capturing data when the target is eating or drinking and is otherwise off.

FIG. 5 depicts an example implementation of the video camera trigger 410. Using the video camera trigger 410, battery life/energy usage can be conserved, and the camera 420 can be triggered when needed but left otherwise in an idle/sleep/power down mode, for example. The example trigger 410 includes emitter(s) 510 generating a signal 520 detected by a receiver 530 for analysis by a returned signal analyzer 540. Signal analysis by the signal analyzer 540 includes preprocessing by a preprocessor 550 and a feature extractor 560 to be supplied to a classifier 570. A resulting output from the classifier 570 forms a trigger for the video camera 420 to trigger a recording of video for a time period (e.g., X seconds such as 5 seconds, 10 seconds, 20 seconds, 30 seconds, 60 seconds, etc.).

For example, a proximity sensor can be used with an RFduino board to trigger the video camera 420. The proximity sensor works based on optical reflection, for example. An optical proximity sensor includes a light source and a sensor that detects the light. The light source generates light of a frequency that the light sensor is best able to detect, and that is not likely to be generated by other nearby sources (such as IR light, etc.). The light sensor circuit is designed so that light which is not pulsing at this frequency is rejected. The light sensor in the optical proximity sensor is a semiconductor, for example, which generates a small current when light energy strikes it. Once an object approaches the proximity sensor, there will be a peak in the trend of data. Thus, the data trend can be used, followed by preprocessing, segmentation (e.g., through energy peak detection), feature extraction, and classification to validly distinguish true triggers from false triggers that may result from proximity to other objects. There are several sources for false triggers: some are dynamic triggers (e.g., a person scratching their head or face); others are static triggers (e.g., a person gets too close to the table or an object). Certain examples approach this as a two-class problem, with class 1 representing feeding gestures (or hand-to-mouth gestures), and all other gestures represented by class 2.

Thus, in certain examples, gestures such as eating gestures, etc., and/or other movements such as swallowing, chewing, chest movement, etc., can be identified by a sensor and used to trigger operation (e.g., recording, etc.) of the video and/or still camera 420. Feeding gestures can include inertial, proximity, and/or imaging-based feeding gestures, for example. Swallow indicators can include acoustic, mechanical, electrical, and/or imaging detectors of swallowing, for example. Chewing detectors can include acoustic, mechanical, and/or imaging detectors of chewing, for example. A piezoelectric sensor can detect swallowing, chewing, etc., for example. Heart rate and heart rate variability (HRV) can also be used as a detectable trigger/feature based on electrocardiogram (ECG) chest information, light emitting diode (LED) and/or photodiode detection, imaging, etc. In certain examples, a galvanic skin response (GSR) (e.g., sweat, etc.) as detected by electrical monitoring can serve as an indicator or trigger for activity to be monitored/captured. Additionally, whether a monitored individual is alone or with others can be determined via acoustic data capture, imaging data capture, etc. In certain examples, both hyperspectral and RGB imaging data can be captured for analysis via the camera 420.

The example preprocessor 550 performs time stamp correction, which identifies losses in the data stream. Using dynamic programming (e.g., a time-series alignment algorithm) and knowing the ideal time stamps (e.g., given the expected sampling frequency of the signal), the actual timestamps can then be aligned properly to the ideal location. If a small amount of data is lost, interpolation (e.g., using spline interpolation) can handle missing signals. Then, data preprocessing helps ensure that the inertial signals' intended measurements were captured, primarily by smoothing to reduce noise and normalization. The premise of smoothing data is that one is measuring a variable that is both slowly varying and corrupted by random noise. Consequently, replacing each data point with an average of surrounding points reduces the level of noise, while hopefully not biasing the values. To smooth the data, a rolling mean (e.g., pandas.rolling_mean, etc.) with a window size of 100 points (e.g., approximately 3 seconds, set empirically, etc.) is applied. Then, the data is normalized using a z-score normalization technique, for example.
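
A sketch of the smoothing and normalization steps with pandas (the pandas.rolling_mean call named above is deprecated and is written here as the equivalent Series.rolling(...).mean(); the window size follows the 100-point example, and spline interpolation is one option for the gap-filling step):

import pandas as pd

def smooth_and_normalize(signal, window=100):
    s = pd.Series(signal).interpolate()                   # fill small gaps
    smoothed = s.rolling(window, min_periods=1).mean()    # rolling mean
    return (smoothed - smoothed.mean()) / smoothed.std()  # z-score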

To segment the data, an energy signal peak detection algorithm is applied to signal whenever an object is in the field of view of the sensor. A two-second window is extracted surrounding the peak, for example. Following segmentation, feature extraction 560 is applied.

Image data can be segmented to enable mapping of segmented objects in an image to images in a food database (e.g., by the food quantifier 440 and/or food identifier 450). Over-segmentation (e.g., a bag of chips is divided into two segments) and under-segmentation (e.g., the bag of chips is combined with a glass of water in one segment) can be prevented. Segmentation can be optimized and/or otherwise improved to accurately segment foods in the image using a joint iterative segmentation/classification technique in which a classifier's feedback, including class label and confidence score, is used to help ensure segmentation is stable. In certain examples, background pixels in an image can be distinguished from foreground pixels (e.g., using a Canny operator) to detect plates, bowls, glasses, etc., and identify objects of interest. A Convolutional Neural Network (CNN) can be applied for spatial modeling, and a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) can be used for temporal modeling to optimize or otherwise improve this approach.

Following data preprocessing 550 of the raw signal, features are identified and extracted to determine which features of the raw signal to collect to predict an outcome. Due to the high variability across signals that represent the same activity, and to help ensure that the system is capable of running in real-time, certain examples extract 11 statistical features on fixed time subdivisions of the data that are known to be useful in detecting activity, including: mean, median, max, min, standard deviation, kurtosis, interquartile range, quartile 1, quartile 3, skewness, and root mean square (RMS), for example. Following feature extraction 560, a classification technique 570 is deployed.
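
The 11 features can be computed per window with NumPy and SciPy, for example (a sketch; the windowing itself is assumed to happen upstream):

import numpy as np
from scipy.stats import kurtosis, skew

def extract_features(window):
    """Return the 11 statistical features named above for one window."""
    q1, median, q3 = np.percentile(window, [25, 50, 75])
    return np.array([
        np.mean(window), median, np.max(window), np.min(window),
        np.std(window), kurtosis(window),
        q3 - q1,                              # interquartile range
        q1, q3, skew(window),
        np.sqrt(np.mean(np.square(window))),  # root mean square
    ])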

Once the optimal feature subset and signal processing algorithms are identified, one or more of a varied set of classification algorithms can be used, including Logistic Regression (LogisticReg), AdaBoostClassifier (AdaBoost), C4.5 Decision Trees (DecisionTree), Gaussian Naive Bayes (GaussianNB), Linear Support Vector Classifier (LinearSVC), and Random Forest (RF) with n=100 trees. Both leave-one-subject-out cross validation (LOSOCV) and 10-fold CV (averaged 10 times) can be used in certain examples. In certain examples, variability across the different runs can be calculated to help ensure that not only is the classifier with the highest F-measure (harmonic mean of precision and recall) selected, but also the classifier with the lowest variability (e.g., highest consistency) across the runs. This helps to show the consistency of the classifiers when tested on different subjects or a different test set. In certain examples, the Random Forest Classifier outperforms other algorithms in accurately distinguishing hand-to-mouth gestures from other objects in the field of view.
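
A sketch of such a comparison with scikit-learn, reporting the mean and variability of the F-measure over repeated 10-fold CV (scikit-learn's DecisionTreeClassifier is a CART-style stand-in for C4.5 here; parameters are illustrative):

import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

def compare_classifiers(X, y, repeats=10):
    models = {
        "LogisticReg": LogisticRegression(max_iter=1000),
        "AdaBoost": AdaBoostClassifier(),
        "DecisionTree": DecisionTreeClassifier(),
        "GaussianNB": GaussianNB(),
        "LinearSVC": LinearSVC(),
        "RF": RandomForestClassifier(n_estimators=100),
    }
    results = {}
    for name, model in models.items():
        scores = np.concatenate([
            cross_val_score(model, X, y, cv=10, scoring="f1_macro")
            for _ in range(repeats)])
        # Keep both the mean F-measure and its variability across runs.
        results[name] = (scores.mean(), scores.std())
    return results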

FIG. 6 depicts an example implementation of the image calibrator 430. The image calibrator 430 is to correct lens distortion, measure the size of pixels in real-world units, etc. In image calibration 430, a normalized image 602 is processed by the image calibrator 430 to extract key points 604. The key points 604 are processed to extract Scale-Invariant Feature Transform (SIFT) features 606, for example. The SIFT features are then used to register image frames 608 (e.g., frames from the acquired video image content, etc.).

Frame registration information 608 and physical characteristics of an average individual 610 are provided to capture a hand identifier 612 associated with the image data. The hand identifier 612 includes a feature extractor 614 to extract features for a hand detector 616, which detects the hand in the image data and generates a unique hand identifier 618, for example. The hand identifier 618 is provided with a lens model 620 to a parameter estimator 622 to estimate parameters of the lens and image sensor. The parameter estimator 622 provides output(s) including a lens distortion correction 624, a pixel size measurement 626, etc.

In certain examples, a hyperspectral signature is formed from the captured image data. Rather than storing all parts of the spatial and spectral domains, a subset can be identified and stored. In certain examples, raw images are stored in a database, and unique hyperspectral signatures are used for fast database querying and indexing.

FIG. 7 illustrates an example implementation of the food quantifier 440. The example food quantifier 440 quantifies each food item consumed, for example. The quantifier 440 takes the pixel size measurement 626 and generates a calibrated set of images 702. The images 702 are provided to a deep learning network 704, such as an LSTM RNN deep learning network, etc. The network 704 uses the calibrated set of images 702 and its network of nodes and connections to form connections and develop a trained database, for example. Once trained, the network 704 can be deployed to identify locations of foods (e.g., on plate, table, etc.) 706, determine quantity 708, etc.

In certain examples, a small, colored checkerboard pattern is placed in the field of view of the camera 420 (e.g., attached to the bottom of the field of view of the camera 420 with respect to the part that shows the body of the wearer, etc.) to aid in calibrating for size and color of objects in a captured scene. When mapping the small checkerboard pattern to the size of an individual's thumbnail, the image data can be calibrated to determine a quantity of food near the individual's hand. The color in the patterns helps ensure the images can be normalized for the system 200 to function under a plurality of levels of illumination (e.g., light or dark).

For example, hyperspectral sensors collect information as a set of images. The images with the essential spectral bands are extracted and fed into a recurrent neural network (RNN) model 704, which is designed specifically for sequential data. RNN models 704 can be trained on input sequences of one length and then generalized to test sequences of a different length, for example. The RNN 704 achieves this property through the inclusion of cycles in its computation graph, as well as sharing of parameters across time, for example.

A particular implementation of the RNN 704 is utilized to capture locations of food 706 and to determine the quantity of food 708 from the continuous data stream of the video feed, for example. First, supervised sequence labeling models are used from existing hyperspectral imaging data, where the output sequence has the same length as the input sequence. Second, long short-term memory (LSTM) is used as the recurrent layer to avoid the vanishing gradient problem common when applying RNNs.

LSTM has been employed successfully in many applications related to sequential data, such as speech recognition, video action recognition, and wearable action recognition. A complicated model can be built with multiple recurrent and convolutional layers. The neural network model SwallowNet is designed to have a single (but possibly several) recurrent layer combined with one nonlinear transformation layer for feature extraction, for example.

In certain examples, the data stream is split into chunks representing each essential spectral band. Each image is then split into chunks representing portions of the image for the system to learn whether these chunks from this band represent food or not. These chunks are then transformed through a nonlinear embedding layer which resembles feature extraction, for example. The network learns the optimal representation to differentiate between food and no food.

Features are then fed into an LSTM layer to learn the temporal dynamics of the signals. An LSTM layer has an internal state and output, which are updated recurrently throughout the sequence. LSTM utilizes forget gates, input gates, and output gates to implement this update, for example.

From these gates, the internal states and output can be obtained. Outputs from LSTM layers are transformed through another linear transformation layer to obtain two-dimensional outputs for each chunk, for example. The loss of the network is then calculated as the cross-entropy loss between ground truth and the soft-max activation of the output layer, summed over the whole sequence, for example.

The RNN 704 can be trained on images from each spectral band, for example. If not enough data is present, data augmentation can be performed from existing data. At each iteration, images and their corresponding labels are fed into the optimization. In certain examples, to increase the training set, data augmentation is used by scaling the sequences by a random number between 0.8 and 1.2. This range is selected empirically to introduce realistic noise into the data, while not drastically distorting signal shape.

A dimension of the embedding layer is selected to compress the original data. The dimension of the LSTM layer is set. The network 704 is trained using the Adam optimization algorithm with a learning rate of 1e-3, for example. A number of training iterations is fixed throughout the process. A backpropagation through time algorithm updates both the feature representation and LSTM weights at each optimization iteration, instead of training each layer separately, for example.
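
A compact PyTorch sketch of the architecture described above (a nonlinear embedding layer, a recurrent LSTM layer, and a linear output layer giving a two-way food/no-food decision per chunk); the dimensions are assumptions, and backpropagation through time is handled by autograd when the sequence loss is backpropagated:

import torch
import torch.nn as nn

class ChunkSequenceNet(nn.Module):
    def __init__(self, chunk_dim=900, embed_dim=32, hidden_dim=64):
        super().__init__()
        # Nonlinear embedding layer resembling feature extraction.
        self.embed = nn.Sequential(nn.Linear(chunk_dim, embed_dim), nn.Tanh())
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)   # two-way output per chunk

    def forward(self, x):                     # x: (batch, seq_len, chunk_dim)
        h, _ = self.lstm(self.embed(x))
        return self.out(h)                    # logits per chunk

model = ChunkSequenceNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()               # cross-entropy over the sequence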

FIG. 8 illustrates an example implementation of the food identifier 450. The food identification 450 is to determine a probability of each food item on a monitored plate, for example. The food identification 450 can be based on a sparse representation of data, for example.

As shown in the example of FIG. 8, a dictionary 802 forms a trained database 804 based on characteristics spectrally identifying one or more food items. The food item characteristics are processed with respect to measured feature extraction 806 information and provided to a linear combination solver 808. The solver 808 solves for X to discover a combination of food items 810 present in the image field of view, for example.
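
One way to realize such a linear combination solver is non-negative least squares, where each column of the dictionary holds a known food item's spectral signature and the recovered non-negative weights X suggest which foods are present (a sketch under that assumption; names are illustrative):

import numpy as np
from scipy.optimize import nnls

def solve_food_mixture(measured_spectrum, dictionary):
    """Solve measured_spectrum ~= dictionary @ x for x >= 0."""
    x, residual = nnls(dictionary, measured_spectrum)
    return x, residual

Sparsity can be encouraged further (e.g., with an L1-penalized solver) so that only a few dictionary entries carry significant weight, matching the sparse representation described above.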

Machine learning algorithms, such as a Random Forest classifier, etc., can be used to classify foods. In certain examples, such algorithms are optimized or otherwise improved further using sparse signal representation to obtain a compact, high-fidelity representation of the observed food item and to extract semantic information using unique signatures across multiple wavelengths of the spectrum. In certain examples, hyperspectral images are combined with existing RGB imagery databases. In certain examples, such databases are enriched with spectral and audio signatures. Unknown food items can be detected early in the development phase to populate and advance food databases with new foods, for example.

FIG. 9 illustrates an example implementation of the energy intake calculator 460 to determine calories consumed from a meal, bites, etc. As shown in the example of FIG. 9, a probability of each food item 902 and a history of personalized food preference 904 are provided, with an indication of food detected 810, food quantity 708, and a food quantity error 906, to a caloric intake estimator 908. The caloric intake estimator 908 processes the food information 708, 810, 902, 904, 906 and leverages a calorie database 910 to generate a caloric intake value 912.

In certain examples, the caloric intake estimator 908 takes into account foods detected 810 and food quantity 708, along with food quantity error 906, probability of each food item 902, history of personalized food preference (e.g., what foods does this individual eat/not eat, etc.) 904, and input from the calorie database 910, and generates the caloric intake estimate 912 with a confidence interval (CI).

In certain examples, the energy intake calculator 460 determines which parts of the spectrum to use to improve/optimize determination of nutrients and caloric content. In addition to distinguishing between calorie levels, the energy intake calculator 460 can determine whether food content is low/med/high in salt, sugar, fat, protein, and/or carbohydrate, etc. Each spectrum provides unique information that will enable the system 200 to adjust levels based on the food type identified (e.g., by the food identifier 450).

In certain examples, an inertial motion unit (IMU) can be used in frame registration to help align hyperspectral frames across time during user movement.

In certain examples, the energy intake calculator 460 implements an energy intake algorithm to combine each identified food item and associated confidence rating and quantity to estimate a number of calories of each food segment identified on each plate. A first type of algorithm identifies a number of calories at the beginning and end of the meal, which helps determine foods consumed during the entire meal. A second type of algorithm tracks foods that move from the hand to the mouth and estimates calories accordingly.

FIG. 10 illustrates an example implementation of the energy expenditure calculator 470. The energy expenditure calculator 470 determines caloric expenditure by calculating a metabolic equivalent of task(s) using inertial body sensors, such as a triaxial accelerometer, gyroscope, magnetometer, etc. As shown in the example of FIG. 10, inertial sensor(s) 1002 provide position/motion information with image data for preprocessing 1004 and segmentation 1006. From the preprocessed and segmented information, feature extraction 1008 can generate features for classification 1010. Classification 1010 generates an indication of activity 1012 and an intensity 1014 associated with that activity 1012. Activity 1012 and intensity 1014 information can be combined with demographic information such as age 1016, gender 1018, and body mass index (BMI) 1020 to form a Metabolic Equivalent of Task (MET) mapping 1022. The MET, or metabolic equivalent, is a physiological measure quantifying an energy cost of physical activities, defined as a ratio of a) metabolic rate (a rate of energy consumption) during a specific physical activity to b) a reference metabolic rate. The MET mapping 1022 is provided to the data processor 480 with caloric intake estimation information 912 to generate an outcome 490-494.
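
For illustration, the widely used MET-to-kilocalorie conversion (kilocalories ≈ MET × body mass in kilograms × duration in hours) can be sketched as follows; the MET table values here are placeholders, not the mapping 1022 itself:

# Illustrative activity/intensity-to-MET lookup (placeholder values).
MET_TABLE = {
    ("sitting", "low"): 1.3,
    ("walking", "moderate"): 3.5,
    ("running", "high"): 8.0,
}

def energy_expenditure_kcal(activity, intensity, weight_kg, hours):
    """kcal = MET x body mass (kg) x duration (h)."""
    return MET_TABLE[(activity, intensity)] * weight_kg * hours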

Once the food has been identified and energy intake and expenditure have been calculated, a state of the monitored individual can be determined. For example, a histogram of food consumed for 1-2 meals can be determined and compared to a 1 z-score cutoff or threshold to evaluate whether the target is overeating (e.g., over the threshold), undereating (e.g., under the threshold), etc. For example, salt intake can be estimated based on food identification and energy intake. Food identification can be used to determine whether the monitored, target individual ate a snack versus a full meal and whether the individual has one overeating meal in a day versus multiple overeating meals. A meal plan can be adjusted, a prescription issued, and/or other intervention triggered if the analysis indicates a pattern of overeating, undereating, etc.

The z-score indicates a distance of a number from the mean of a normally distributed data set. A 1 z-score is within one standard deviation of the mean. Thus, if the mean defines a “normal” meal, then a caloric/energy outcome that is more than one standard deviation above the mean indicates overeating, and an outcome that is more than one standard deviation below the mean indicates undereating. Thus, if a participant p consumes n eating episodes (e.g., meals/snacks) for 14 days, then a personalized caloric distribution model is created and converted to a standard normal distribution. For an eating episode i during day d, E_(i)^(d)(x) is considered overeating if the determined caloric intake C_(i)(x) exceeds the 1 z-score (e.g., 84.1st percentile, etc.) threshold of participant p's personalized caloric intake distribution, which indicates larger than normal meals typically consumed by the participant.
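
A sketch of this personalized threshold, assuming a history of per-episode caloric intakes for participant p (names are illustrative):

import numpy as np

def is_overeating(history_kcal, episode_kcal):
    """Flag an eating episode whose caloric intake exceeds the
    1 z-score threshold (about the 84th percentile) of the
    participant's own intake distribution."""
    mu = np.mean(history_kcal)
    sigma = np.std(history_kcal)
    return (episode_kcal - mu) / sigma > 1.0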

FIG. 11 illustrates a flow diagram of an example method 1100 of hyperspectral image analysis using the example system 200. At block 1102, image acquisition is triggered. For example, a sensor can trigger 410 acquisition by a video and/or other imaging camera 420. At block 1104, acquired image data is calibrated 430. At block 1106, food detected in the image data is quantified 440, and, at block 1108, the food is identified 450. At block 1110, energy intake is calculated 460. At block 1112, energy expenditure 470 is calculated. At block 1114, an evaluation is generated 480 based on the energy intake calculation 460 and the energy expenditure calculation 470. Output can include an evaluation of over-eating 490, under-eating 492, presence of allergen 494, etc.

FIG. 12 is a block diagram of an example processor platform 1200 capable of executing instructions to implement the example apparatus, systems, and methods disclosed and described herein. The processor platform 1200 can be, for example, a server, a personal computer, a mobile device (e.g., a cell phone, a smart phone, a tablet such as an IPAD™), a personal digital assistant (PDA), an Internet appliance, or any other type of computing device.

The processor platform 1200 of the illustrated example includes a processor 1212. Processor 1212 of the illustrated example is hardware. For example, processor 1212 can be implemented by one or more integrated circuits, logic circuits, microprocessors, or controllers from any desired family or manufacturer.

Processor 1212 of the illustrated example includes a local memory 1213 (e.g., a cache). Processor 1212 of the illustrated example is in communication with a main memory including a volatile memory 1214 and a non-volatile memory 1216 via a bus 1218. Volatile memory 1214 can be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device. The non-volatile memory 1216 can be implemented by flash memory and/or any other desired type of memory device. Access to main memory 1214, 1216 is controlled by a memory controller. The processor 1212, alone or in conjunction with the memory 1213, can be used to implement all or part of the apparatus, systems, and/or methods disclosed herein.

Processor platform 1200 of the illustrated example also includes an interface circuit 1220. Interface circuit 1220 can be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), and/or a PCI Express interface.

In the illustrated example, one or more input devices 1222 are connected to the interface circuit 1220. Input device(s) 1222 permit(s) a user to enter data and commands into processor 1212. The input device(s) 1222 can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, isopoint, and/or a voice recognition system.

One or more output devices 1224 are also connected to interface circuit 1220 of the illustrated example. Output devices 1224 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display, a cathode ray tube (CRT) display, a touchscreen, a tactile output device, a printer, and/or speakers). Interface circuit 1220 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip, or a graphics driver processor.

Interface circuit 1220 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem, and/or a network interface card to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 1226 (e.g., an Ethernet connection, a digital subscriber line (DSL), a telephone line, coaxial cable, a cellular telephone system, etc.).

Processor platform 1200 of the illustrated example also includes one or more mass storage devices 1228 for storing software and/or data. Examples of such mass storage devices 1228 include floppy disk drives, hard disk drives, compact disk drives, Blu-ray disk drives, RAID systems, and digital versatile disk (DVD) drives.

Coded instructions 1232 associated with any of the examples disclosed and described herein can be stored in mass storage device 1228, in volatile memory 1214, in non-volatile memory 1216, and/or on a removable tangible computer readable storage medium such as a CD or DVD.

It may be noted that operations performed by the processor platform 1200 (e.g., operations corresponding to process flows or methods discussed herein, or aspects thereof) may be sufficiently complex that the operations may not be performed by a human being within a reasonable time period.

As shown in FIG. 13, a snapshot imager 1300 can be mounted on a necklace, strap, etc. 1310 to gather, process, and transmit imaging and other sensor data for caloric and/or other energy analysis. The snapshot imager 1300 is a nutrient detector that captures visible (e.g., 470-630 nm) and/or near infrared (e.g., 600-1000 nm) light in 16 visual bands and/or 25 near infrared bands, for example. The example snapshot imager 1300, like the camera 420, provides an imager, lens, battery/power, and data transfer connection to capture image data and transfer data to the rest of the system 400 (e.g., to the image calibrator 430 to be processed and provided to the food quantifier 440 and the food identifier 450) for processing and analysis. The imager 1300 can integrate spectral filters ‘per pixel’ monolithically on top of image wafers in a mosaic pattern, for example. The mosaic pattern allows multispectral imaging at video rates in a small form factor. Particular spectral bands can be identified for use in image data capture, for example.
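As an illustration of reading such a mosaic sensor, the following sketch splits a raw frame into per-band planes, assuming a 4×4 repeating tile of 16 visible-band filters; the actual filter layout is device specific, so the indexing here is hypothetical.

import numpy as np

def demosaic(raw, tile=4):
    """Split an (H, W) mosaic frame into (tile*tile, H//tile, W//tile) band planes."""
    bands = [raw[r::tile, c::tile] for r in range(tile) for c in range(tile)]
    return np.stack(bands)

# Illustrative 10-bit raw frame from a 480x640 mosaic sensor.
frame = np.random.randint(0, 1024, size=(480, 640), dtype=np.uint16)
cube = demosaic(frame)
print(cube.shape)  # (16, 120, 160): one low-resolution plane per spectral band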

In certain examples, the snapshot imager 1300 can include a microphone to establish a unique signature for each food to complement its spectral signature. For example, a beamforming microphone array can be used with background noise cancellation to help ensure high quality in noisy environments. The camera, microphone, and/or other sensor(s) can be integrated on one or more printed circuit boards, processors, etc., in the imager 1300, for example.

The example snapshot imager 1300 can be part of an architecture 1400 including a mobile application 1410 communicating with a cloud-based infrastructure 1420, such as shown in the example of FIG. 14. The example snapshot imager 1300 is implemented as a wearable device including a sensor system 1402 (e.g., a camera, microphone, proximity sensor, etc.), a microcontroller or other processor 1404, a Bluetooth Low Energy (BLE) and/or other communication interface 1406, and a battery (e.g., a low profile battery, etc.) 1408.

The wearable sensor/imaging device 1300 communicates (e.g., via Bluetooth, WiFi, Near Field Communication (NFC), etc.) with a mobile application 1410 to process the acquired image and/or other sensor data from a target user. The mobile application 1410 includes a plurality of layered algorithms 1412 to process the image/sensor data, including sensor data fusion, feature analysis, feature extraction, feature selection, and classification. The mobile application 1410 also includes a user interface 1414 to allow a user to interact with the mobile application 1410 and associated functionality 1412, data, etc. The mobile application 1410 also includes a cloud interface 1416 to send data to and/or request data from the cloud-based infrastructure 1420. The mobile application further includes third party integration 1418 for one or more third party applications, platforms, etc.

The cloud-based infrastructure or platform 1420 includes secure data storage 1422 to store data from the mobile application 1410 and/or aggregate that data with data from other sources (e.g., other wearable device(s) 1300, information system(s), etc.). The cloud infrastructure 1420 includes a Web service 1424 to display and provide interaction with processed data, etc., via the Web. The cloud infrastructure 1420 also includes one or more metrics 1426 computed and stored to evaluate individual, group, and/or other aggregate target behavior, intervention, etc.

In the example of FIG. 14, sensor(s) 1402 transmit data to the microcontroller board 1404, which filters the data and uses the radiofrequency (RF) (e.g., BLE, etc.) transceiver to transmit the data to the mobile platform 1410. The mobile platform 1410 fuses data from the sensors and extracts one or more features. A subset of features is used to classify food type, associated calories, etc. Feedback can then be provided to a user.

Thus, certain examples provide wearable sensors (e.g., a neck-worn camera, etc.) and associated processing systems to determine caloric and nutrient intake. Certain examples enable monitoring of an individual, group, population, etc. Certain examples determine caloric and nutrient intake in a particular patient/user context and provide feedback.

In certain examples, chewing can be detected using a proximity sensor arranged in the imager 1300 near a target user. Additionally, an IMU can be provided in the device 1300 that identifies leaning of the target user and extrapolates whether the user is leaning to eat or drink (e.g., chew, swallow, etc.). Image data can be segmented and candidates selected using a linear-time approximation algorithm, for example. In certain examples, features from three signals, gradient boosting, and time point fusion enable capture of eating episodes for analysis. Thus, the example device 1300 can include an IMU, light sensor, and proximity sensor, as well as a camera, to capture information from a target and relay the information for processing, for example.

FIG. 15A illustrates example food samples that can be imaged and analyzed for caloric content and composition according to the apparatus, systems, methods, etc., described above. FIG. 15B shows an average spectrum for three kinds of food items under the same light condition and fat content. As demonstrated by the example of FIG. 15B, the food items can be differentiated based on their associated spectrum characteristics. FIGS. 15C and 15D illustrate spectra associated with stew having medium (FIG. 15C) and high (FIG. 15D) fat content, imaged in three different light conditions. Differences in spectra can be identified based on the intensity of light.

FIG. 16 illustrates a table showing results of food calorie content identification using RGB and hyperspectral image analysis. As shown in the example table of FIG. 16, hyperspectral image analysis produces improved accuracy with better precision and recall compared to RGB image analysis. Particularly with yams (an increase from 65.83% to 81.66%) and stew (an increase from 54.44% to 82.77%), the hyperspectral image analysis demonstrates significantly improved accuracy. Precision and recall are similarly improved.

Recall (sensitivity) is a likelihood that an item will be correctly identified (e.g., a likelihood that a low-sugar banana dish will be correctly identified as containing a low level of sugar, etc.). Precision is a probability that a dish identified with a certain characteristic (e.g., a dish identified as containing low sugar, etc.) actually has that characteristic (e.g., actually contains a low level of sugar).
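A short worked example of these two metrics, using illustrative counts for the low-sugar case (not results from the source):

# Illustrative counts for "low sugar" identification.
true_positives = 45   # low-sugar dishes correctly identified as low sugar
false_negatives = 5   # low-sugar dishes missed by the classifier
false_positives = 10  # other dishes wrongly identified as low sugar

recall = true_positives / (true_positives + false_negatives)     # 0.90
precision = true_positives / (true_positives + false_positives)  # ~0.82
print(f"recall={recall:.2f}, precision={precision:.2f}")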

FIG. 17 shows an example confusion matrix including labels classified using one or more SVMs. FIG. 17 provides true labels, classified labels, sensitivity/recall, and associated precision and accuracy, along with an F1 score. False negatives are samples which were incorrectly identified as belonging to a different sugar level. Since this is not a binary classification in a 2×2 matrix, there is no single false positive metric. The F1 score is a harmonic mean between precision and recall, and accuracy is the number of correctly identified samples divided by the total number of samples.
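These per-class quantities can be computed directly from a multiclass confusion matrix, as in the following sketch (rows are true sugar levels, columns are classified levels; the counts are illustrative, not the values of FIG. 17):

import numpy as np

cm = np.array([[30, 3, 1],    # true low sugar
               [4, 25, 2],    # true medium sugar
               [2, 3, 28]])   # true high sugar

tp = np.diag(cm)
recall = tp / cm.sum(axis=1)                        # per-class sensitivity
precision = tp / cm.sum(axis=0)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
accuracy = tp.sum() / cm.sum()                      # correct / total
print(recall, precision, f1, accuracy)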

FIGS. 18A-18B show example confusion matrices for hyperspectral (FIG. 18A) and RGB (FIG. 18B) image data and analysis. As indicated on the far right of each matrix, accuracy jumps from 47% using RGB data analysis to 91% using hyperspectral data analysis.

FIG. 19 shows an example of F-measures obtained for classification of fat in eggs using hyperspectral analysis alone, RGB analysis alone, or a fused hyperspectral/RGB analysis. Thus, hyperspectral image analysis, alone or in combination with RGB image data analysis, enables food identification and determination of associated caloric content. A target individual's caloric intake can then be determined and compared to their caloric expenditure. While RGB narrows colors to a single red, green, or blue value, hyperspectral imaging captures an intensity level at each wavelength in a spectrum. Thus, hyperspectral imaging over the RGB portion of the spectrum can provide a range of reds, greens, and blues, for example. RGB information can be augmented with hyperspectral image data to provide improved results.
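One simple way to combine the two analyses is late fusion of classifier probabilities, sketched below on synthetic features standing in for hyperspectral and RGB data; the classifiers, feature shapes, and labels are illustrative, and the source does not specify the fusion method used for FIG. 19.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_hsi = rng.normal(size=(200, 41))  # stand-in hyperspectral features
X_rgb = rng.normal(size=(200, 3))   # stand-in RGB features
y = rng.integers(0, 2, size=200)    # e.g., low-fat vs. high-fat labels

hsi_clf = LogisticRegression().fit(X_hsi[:150], y[:150])
rgb_clf = LogisticRegression().fit(X_rgb[:150], y[:150])

# Average class probabilities from both modalities, then take the argmax.
proba = (hsi_clf.predict_proba(X_hsi[150:]) +
         rgb_clf.predict_proba(X_rgb[150:])) / 2
fused_pred = proba.argmax(axis=1)
print((fused_pred == y[150:]).mean())  # fused held-out accuracy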

FIG. 20 illustrates the systems and methods described above applied to a chewing and eating detection framework 2000. As shown in the example of FIG. 20, one or more signals such as a proximity signal, energy signal, ambient light signal, and lean forward angle (LFA) signal are acquired from the example device 1300. The LFA can be calculated by applying a dot product of the normal vectors of two planes to define the LFA as the angle between the IMU and Earth's surface:

LFA = arccos⟨n1, n2⟩  (Eq. 4),

where the normal vector of the Earth's surface is the z-axis and the normal vector of the IMU is obtained through quaternion transformation:

n1 = [0, 0, 1], n2 = q n1 q⁻¹  (Eq. 5),

where q is a unit quaternion that rotates n1 to obtain the normal vector of the IMU. The IMU can also provide a triaxial accelerometer (ax, ay, az) capturing acceleration from three axes to calculate an energy sum as the sum of squares of the accelerometer:

E = ax² + ay² + az²  (Eq. 6).

Chewing and/or another environmental context trigger can be used to trigger image data capture and/or correlate captured image data to determined user events (e.g., chewing, swallowing, breathing, etc.). Errors can be determined (e.g., based on time stamp, period, etc.) as well to help ensure an erroneous trigger does not result in capture of erroneous/inapplicable data. Features can be extracted including maximum, minimum, mean, median, variance, root mean square (RMS), correlation, skew, kurtosis, first and third quartiles, interquartile range, etc. Extracted features can be gradient boosted and/or otherwise processed to select certain features, clean up feature data, etc., for analysis. In certain examples, features can be gathered, analyzed, fused, etc., over time.
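A minimal sketch of Eqs. 4-6 follows: it rotates the world z-axis by the IMU's unit quaternion to obtain the device normal, takes the LFA as the angle between the two normals, and computes the accelerometer energy sum. The quaternion and accelerometer values are illustrative.

import numpy as np

def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q = [w, x, y, z] (computes q v q^-1)."""
    w, x, y, z = q
    r = np.array([0.0, *v])
    def mul(a, b):  # Hamilton product of two quaternions
        aw, ax, ay, az = a
        bw, bx, by, bz = b
        return np.array([
            aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw,
        ])
    q_conj = np.array([w, -x, -y, -z])
    return mul(mul(q, r), q_conj)[1:]

n1 = np.array([0.0, 0.0, 1.0])                    # Earth surface normal (Eq. 5)
q = np.array([np.cos(0.2), np.sin(0.2), 0.0, 0.0])  # example unit quaternion
n2 = quat_rotate(q, n1)                           # IMU normal (Eq. 5)
lfa = np.arccos(np.clip(np.dot(n1, n2), -1, 1))   # lean forward angle (Eq. 4)

ax, ay, az = 0.1, -0.4, 9.7                       # triaxial accelerometer sample
energy = ax**2 + ay**2 + az**2                    # energy sum (Eq. 6)
print(np.degrees(lfa), energy)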

As shown in the example of FIG. 20, captured signal information can be labeled to identify meals and chewing, track a food log, etc. Information can then be segmented based on a wear time filter for the device 1300, analysis of prominent peaks in acquired signal information, examination of period, etc. Features are extracted from the segmented information to identify statistical features, frequency-based features, periodic subsequence features, etc. Features can then be processed, such as with gradient boosting, time-point fusion, etc., to better prepare them for classification and application to food content and consumption analysis.
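A sketch of extracting the statistical features listed above from one segmented window follows; the synthetic window stands in for a proximity, energy, or LFA segment.

import numpy as np
from scipy import stats

def window_features(x):
    """Compute the statistical features listed above for one signal window."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    return {
        "max": x.max(), "min": x.min(), "mean": x.mean(),
        "median": np.median(x), "variance": x.var(),
        "rms": np.sqrt(np.mean(x ** 2)),
        "skew": stats.skew(x), "kurtosis": stats.kurtosis(x),
        "q1": q1, "q3": q3, "iqr": q3 - q1,
    }

# Illustrative window: a periodic chewing-like signal with noise.
window = np.sin(np.linspace(0, 4 * np.pi, 200)) + 0.1 * np.random.randn(200)
print(window_features(window))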

Thus, certain examples provide hyperspectral imaging analysis to distinguish foods and identify different levels of macromolecule content in those foods. To detect different food types, features can be generated from a variety of spectra (e.g., visible spectrum, near infrared spectrum, etc.), and improved accuracy can be achieved. Further, different levels of macromolecules can be detected using principal component analysis (PCA) to reduce dimensionality and employing a support vector machine (SVM) classifier with a radial basis function (RBF) kernel. The PCA plus SVM process can be applied to the entire spectrum rather than only RGB channels. Using a larger range of wavelengths is useful in more accurately determining the nutritional content of a food, especially when two samples look the same but differ in their macromolecule contents. In addition, in some instances, fusing the results of hyperspectral and RGB classifiers can result in the greatest predictive power. This supports the notion that current databases of food images should be expanded to include hyperspectral images in order to increase the efficiency of automated image-based calorie detection. In certain examples, rather than or in addition to classifying macromolecule content, a regression model can be implemented to predict an amount of macromolecule present in samples. Thus, hyperspectral technology can be used to estimate the nutritional content of foods.
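A minimal sketch of the PCA-plus-SVM process on full spectra follows, using synthetic data in place of labeled hyperspectral samples; the band count, component count, and labels are illustrative assumptions.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 41))    # 300 samples x 41 spectral bands
y = rng.integers(0, 3, size=300)  # e.g., low/medium/high fat labels

clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),         # reduce spectral dimensionality
    SVC(kernel="rbf"),            # RBF-kernel support vector machine
)
clf.fit(X[:200], y[:200])
print(clf.score(X[200:], y[200:]))  # held-out accuracy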

Certain examples provide systems, apparatus, methods, and computer readable storage media for hyperspectral analysis and item composition evaluation to classify a target.

An example hyperspectral imaging sensor system to identify item composition includes an imager to capture hyperspectral imaging data of one or more items with respect to a target. The example includes a sensor to be positioned with respect to the target to trigger capture of the image data by the imager based on a characteristic of the target. The example includes a processor to prepare the captured imaging data for analysis to at least: identify the one or more items; determine composition of the one or more items; calculate an energy intake associated with the one or more items; and classify the target based on the energy intake.

In certain examples, the one or more items include food items. In certain examples, the characteristic of the target includes at least one of chewing or swallowing. In certain examples, the sensor includes at least one of a proximity sensor, an inertial measurement unit, or a light sensor. In certain examples, the imager is to capture both hyperspectral imaging data and red-green-blue imaging data to be fused into combined imaging data for analysis. In certain examples, the energy intake is based on caloric content of the one or more items to be determined by analyzing the spectrum associated with the hyperspectral imaging data. In certain examples, the target is a person, and the target is to be classified as overeating or undereating. In certain examples, classification of the target is based on a comparison of the energy intake to a calculation of energy expenditure for the target. In certain examples, analysis of the captured image data is to include a principal component analysis with a support vector machine including a radial basis function kernel.

An example hyperspectral imaging data processor includes an image calibrator to receive hyperspectral imaging data of one or more food items captured with respect to a target in response to a trigger by a sensor positioned with respect to the target; a food identifier to identify one or more food items in the hyperspectral imaging data to provide one or more identifications; a food quantifier to quantify the one or more food items in the hyperspectral imaging data to provide one or more quantities; an energy intake calculator to process the one or more identifications and the one or more quantities to calculate an energy intake associated with the one or more food items; and a data processor to classify the target based on the energy intake.

At least one example non-transitory computer readable storage medium includes instructions which, when executed, cause at least one processor to at least: facilitate capture of hyperspectral imaging data representing one or more items with respect to a target, the capture based on a trigger from a sensor positioned with respect to the target; identify the one or more items; determine composition of the one or more items; calculate an energy intake associated with the one or more items; and classify the target based on the energy intake.

An example method of hyperspectral imaging to identify item composition includes facilitating acquisition of hyperspectral imaging data representing one or more items with respect to a target, the capture based on a trigger from a sensor positioned with respect to the target; identifying the one or more items; determining composition of the one or more items; calculating an energy intake associated with the one or more items; and classifying the target based on the energy intake.

An example hyperspectral image analyzer apparatus includes: means for facilitating acquisition of hyperspectral imaging data representing one or more items with respect to a target, the capture based on a trigger from a sensor positioned with respect to the target; means for identifying the one or more items; means for determining composition of the one or more items; means for calculating an energy intake associated with the one or more items; and means for classifying the target based on the energy intake. The example apparatus can further include means for calculating energy expenditure of the target for comparison to the energy intake to classify the target based on the energy intake and energy expenditure.

This written description uses examples to disclose the invention, including the best mode, and also to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined by the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.

The invention claimed is:
1. A hyperspectral imaging sensor system to identify item composition, the system comprising: an imager to capture hyperspectral imaging data of one or more items with respect to a human target; a sensor to be positioned with respect to the human target to trigger capture of the image data by the imager based on a characteristic of the human target; and a processor to prepare the captured imaging data for analysis to at least: identify the one or more items; determine composition of the one or more items; calculate an energy intake associated with the one or more items; and classify a status of the human target based on the energy intake.
2. The system of claim 1, wherein the one or more items include food items.
3. The system of claim 1, wherein the characteristic of the human target includes at least one of chewing or swallowing.
4. The system of claim 1, wherein the sensor includes at least one of a proximity sensor, an inertial measurement unit, or a light sensor.
5. The system of claim 1, wherein the imager is to capture both hyperspectral imaging data and red-green-blue imaging data to be fused into combined imaging data for analysis.
6. The system of claim 1, wherein the energy intake is based on caloric content of the one or more items to be determined by analyzing the spectrum associated with the hyperspectral imaging data.
7. The system of claim 1, wherein classification of the status of the human target is based on a comparison of the energy intake to a calculation of energy expenditure for the human target.
8. The system of claim 1, wherein analysis of the captured image data is to include a principal component analysis with a support vector machine including a radial basis function kernel.
9. At least one non-transitory computer readable storage medium including instructions which, when executed, cause at least one processor to at least: facilitate capture of hyperspectral imaging data representing one or more items with respect to a human target, the capture based on a trigger from a sensor positioned with respect to the human target; identify the one or more items; determine composition of the one or more items; calculate an energy intake associated with the one or more items; and classify a status of the human target based on the energy intake.
10. The at least one non-transitory computer readable storage medium of claim 9, wherein the characteristic of the human target includes at least one of chewing or swallowing.
11. The at least one non-transitory computer readable storage medium of claim 9, wherein the sensor includes at least one of a proximity sensor, an inertial measurement unit, or a light sensor.
12. The at least one non-transitory computer readable storage medium of claim 9, wherein the imager is to capture both hyperspectral imaging data and red-green-blue imaging data to be fused into combined imaging data for analysis.
13. The at least one non-transitory computer readable storage medium of claim 9, wherein the energy intake is based on caloric content of the one or more items to be determined by analyzing the spectrum associated with the hyperspectral imaging data.
14. The at least one non-transitory computer readable storage medium of claim 9, wherein the human target is to be classified as overeating or undereating.
15. The at least one non-transitory computer readable storage medium of claim 9, wherein classification of the status of the human target is based on a comparison of the energy intake to a calculation of energy expenditure for the human target.
16. The at least one non-transitory computer readable storage medium of claim 9, wherein analysis of the captured image data is to include a principal component analysis with a support vector machine including a radial basis function kernel.
17. A method of hyperspectral imaging to identify item composition, the method comprising: facilitating acquisition of hyperspectral imaging data representing one or more items with respect to a human target, the capture based on a trigger from a sensor positioned with respect to the human target; identifying the one or more items; determining composition of the one or more items; calculating an energy intake associated with the one or more items; and classifying a status of the human target based on the energy intake.
18. The method of claim 17, wherein the imager is to capture both hyperspectral imaging data and red-green-blue imaging data to be fused into combined imaging data for analysis.
19. The method of claim 17, wherein the energy intake is based on caloric content of the one or more items to be determined by analyzing the spectrum associated with the hyperspectral imaging data.
20. The method of claim 17, wherein analysis of the captured image data is to include a principal component analysis with a support vector machine including a radial basis function kernel.